Clustered Matrix Approximation

Authors

  • Berkant Savas
  • Inderjit S. Dhillon
Abstract

In this paper we develop a novel clustered matrix approximation framework, beginning with the motivation behind our research. The proposed methods are particularly well suited for problems with large-scale sparse matrices that represent graphs and/or bipartite graphs from information science applications. Our framework and the resulting approximations have a number of benefits: (1) the approximations preserve important structure that is present in the original matrix; (2) the approximations contain both global-scale and local-scale information; (3) the procedure is efficient in both computational speed and memory usage; and (4) the resulting approximations are considerably more accurate, with less memory usage, than truncated SVD approximations, which are optimal with respect to rank. The framework is also quite flexible, as it may be modified in various ways to fit the needs of a particular application. We also derive a probabilistic approach that uses randomness to compute a clustered matrix approximation within the developed framework, and we prove deterministic and probabilistic bounds on the resulting approximation error. Finally, in a series of experiments we evaluate, analyze, and discuss various aspects of the proposed framework. In particular, all the benefits we claim for the clustered matrix approximation are clearly illustrated using real-world, large-scale data.

Similar resources

Clustered low rank approximation of graphs in information science applications

In this paper we present a fast and accurate procedure called clustered low rank matrix approximation for massive graphs. The procedure involves a fast clustering of the graph and then approximates each cluster separately using existing methods, e.g. the singular value decomposition, or stochastic algorithms. The cluster-wise approximations are then extended to approximate the entire graph. Thi...
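The procedure described above (cluster, approximate each cluster, extend to the whole graph) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cluster partition is taken as given input rather than computed by a fast graph clustering, each diagonal block is approximated by a dense eigendecomposition, and the "extension" step is read as projecting the full matrix onto the block-diagonal basis. The function name `clustered_low_rank` and its parameters are hypothetical.

```python
import numpy as np

def clustered_low_rank(A, clusters, k):
    """Sketch of a clustered low rank approximation A ~ V @ S @ V.T.

    A        : symmetric (n, n) array, e.g. a graph adjacency matrix
    clusters : list of integer index arrays that partition range(n)
               (assumed to come from some clustering step, not shown)
    k        : number of basis vectors kept per cluster
    """
    n = A.shape[0]
    bases = []
    for idx in clusters:
        # Approximate each diagonal block A[idx, idx] separately.
        block = A[np.ix_(idx, idx)]
        w, vecs = np.linalg.eigh(block)
        # Keep the eigenvectors with the k largest eigenvalue magnitudes.
        order = np.argsort(-np.abs(w))[:k]
        bases.append(vecs[:, order])

    # Assemble the block-diagonal basis V = diag(V_1, ..., V_c).
    total = sum(B.shape[1] for B in bases)
    V = np.zeros((n, total))
    col = 0
    for idx, B in zip(clusters, bases):
        V[idx, col:col + B.shape[1]] = B
        col += B.shape[1]

    # Extend to the whole matrix: S also captures the off-diagonal
    # (between-cluster) blocks through V_i.T @ A_ij @ V_j.
    S = V.T @ A @ V
    return V, S
```

For large sparse graphs the dense `eigh` call would typically be replaced by a sparse or stochastic solver, and only `V` (stored block by block) and the small matrix `S` would be kept in memory.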

Compact Representation of Reflectance Fields using Clustered Sparse Residual Factorization

We present a novel compression method for fixed viewpoint reflectance fields, captured for example by a Light Stage. Our compressed representation consists of a global approximation that exploits the similarities between the reflectance functions of different pixels, and a local approximation that encodes the per-pixel residual with the global approximation. Key to our method is a clustered spa...

Parallel Clustered Low-Rank Approximation of Graphs and Its Application to Link Prediction

Social network analysis has become a major research area with impact in diverse applications ranging from search engines to product recommendation systems. A major problem in implementing social network analysis algorithms is the sheer size of many social networks: for example, the Facebook graph has more than 900 million vertices, and even small networks may have tens of millions of vertice...

Fast and Accurate Low Rank Approximation of Massive Graphs

In this paper we present a fast and accurate procedure called clustered low rank matrix approximation for massive graphs. The procedure involves a fast clustering of the graph and then approximating each cluster separately using existing methods, e.g. the singular value decomposition, or stochastic algorithms. The cluster-wise approximations are then extended to approximate the entire graph. Th...

CLSI: A Flexible Approximation Scheme from Clustered Term-Document Matrices

We investigate a methodology for matrix approximation and IR. A central feature of these techniques is an initial clustering phase on the columns of the term-document matrix, followed by partial SVD on the columns constituting each cluster. The extracted information is used to build effective low rank approximations to the original matrix as well as for IR. The algorithms can be expressed by me...
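A rough sketch of the column-clustered idea described above: group the columns of the term-document matrix, take a partial SVD of each group, and combine the results into one low rank approximation. The truncated abstract does not say how the per-cluster pieces are combined, so the projection step below is an assumption, as are the function name `column_clustered_approx` and its parameters; the column clusters are taken as given.

```python
import numpy as np

def column_clustered_approx(A, col_clusters, k):
    """Sketch: low rank approximation A ~ Q @ (Q.T @ A).

    A            : (m, n) term-document matrix (dense here for simplicity)
    col_clusters : list of integer index arrays partitioning the columns
    k            : number of left singular vectors kept per cluster
    """
    parts = []
    for cols in col_clusters:
        # SVD of the columns in this cluster; keep the leading k vectors.
        U, _, _ = np.linalg.svd(A[:, cols], full_matrices=False)
        parts.append(U[:, :k])

    # Concatenate the per-cluster bases and orthonormalize them
    # (assumed combination step, not confirmed by the source).
    Q, _ = np.linalg.qr(np.hstack(parts))

    # Coordinates of every document in the combined basis.
    return Q, Q.T @ A
```

With a sparse matrix, a truncated solver such as `scipy.sparse.linalg.svds` would be the more natural choice for the per-cluster step than a full dense SVD.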

Journal:
  • SIAM J. Matrix Analysis Applications

Volume 37

Publication date: 2016